CNN quantization and compression strategy for edge computing applications
CAI Ruichu, ZHONG Chunrong, YU Yang, CHEN Bingfeng, LU Ye, CHEN Yao
Journal of Computer Applications    2018, 38 (9): 2449-2454.   DOI: 10.11772/j.issn.1001-9081.2018020477
Focused on the problem that the memory- and computation-intensive nature of Convolutional Neural Network (CNN) limits its adoption on embedded devices such as edge computing platforms, a CNN compression method combining network weight pruning with data quantization tailored to embedded hardware data types was proposed. Firstly, according to the weight distribution of each layer of the original CNN, a threshold-based pruning method was applied to eliminate the weights that have little impact on network accuracy, removing redundant information from the network model while preserving the important connections. Secondly, the required bit-width of the weights and activations was analyzed according to the computational characteristics of the embedded platform, and a dynamic fixed-point quantization method was employed to reduce the bit-width of the network model. Finally, the network was fine-tuned to further compress the model size and reduce the computational cost while maintaining inference accuracy. The experimental results show that this method reduces the storage space of VGG-19 by more than 22 times with an accuracy drop of only 0.3%, achieving almost lossless compression. Evaluated on multiple models, the method reduces model storage space by up to 25 times within an average accuracy loss of 1.46%, which demonstrates the effectiveness of the proposed compression method.
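The two core steps described in the abstract, threshold-based weight pruning and dynamic fixed-point quantization, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function names, the threshold value, and the 8-bit width below are illustrative assumptions, and the fractional length is chosen per tensor so that the largest magnitude still fits in the chosen bit-width (the usual dynamic fixed-point convention).

```python
import numpy as np

def threshold_prune(weights, threshold):
    """Zero out weights whose magnitude falls below the threshold,
    keeping the connections that matter most for accuracy."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def dynamic_fixed_point_quantize(values, bit_width):
    """Quantize a tensor to a dynamic fixed-point format: the fractional
    length is picked per tensor from its largest magnitude."""
    max_abs = np.max(np.abs(values))
    # Integer bits needed for the largest magnitude, plus one sign bit.
    int_bits = int(np.ceil(np.log2(max_abs))) + 1 if max_abs > 0 else 1
    frac_bits = bit_width - int_bits
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    quantized = np.clip(np.round(values * scale), qmin, qmax)
    return quantized / scale, frac_bits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_weights = rng.normal(0.0, 0.05, size=(64, 64)).astype(np.float32)

    pruned, mask = threshold_prune(layer_weights, threshold=0.02)
    quantized, frac_bits = dynamic_fixed_point_quantize(pruned, bit_width=8)

    sparsity = 1.0 - mask.mean()
    print(f"sparsity after pruning: {sparsity:.2%}, fractional bits: {frac_bits}")
```

In the paper's pipeline, a fine-tuning pass would follow these two steps to recover the small accuracy loss introduced by pruning and quantization.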